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ABSTRACT 

The linear algorithm of the Wiener filter and constrained realizations (CRs) of Gaus- 
sian random fields is extended here to perform non-linear CRs. The procedure consists 
of: (1) Using low resolution data to constrain a high resolution realization of the under- 
lying field, as if the linear theory is valid; (2) Taking the linear CR backwards in time, 
by the linear theory, to set initial conditions for N-body simulations; (3) Forwarding the 
field in time by an N-body code. An intermediate step is introduced to 'linearize' the low 
resolution data. 

The non-linear CR can be applied to any observational data set that is quasi-linearly 
related to the underlying field. Here it is applied to the IRAS 1.2Jy catalog using 846 
data points within a sphere of 6000 /cm/s, to reconstruct the full non-linear large scale 
structure of our 'local' universe. The method is tested against mock IRAS surveys, taken 
from random non-linear realizations. A detailed analysis of the reconstructed non-linear 
structure is presented. 

Subject headings: cosmology: large-scale structure of universe, methods: statistical 



^ E-mail: bisto@vms.huji.ac.il 
^ E-mail: hoffman@vms.huji.ac.il 



I. Introduction 



In the standard model of cosmology galaxies and the large scale structure of the uni- 
verse form out of a random perturbation field via gravitational instability. It is assumed 
that the primordial perturbation field constitutes a random homogeneous and isotropic 
Gaussian field and that on relevant scales its amplitude is small, hence its dynamics is de- 
scribed by the linear theory of gravitational instability (c/. Peebles 1980). The theoretical 
study of structure formation has been a major effort of modern cosmology (c/. Padman- 
abhan 1993). On the observational side, the large scale structure has been studied mostly 
by means of red-shift surveys (c/. Strauss and Willick 1995) and peculiar velocities (c/. 
Dekel 1994). A method for the reconstruction of the underlying dynamical (density and 
velocity) fields from a given observational data base is presented here. 

The problem of recovering the underlying field from given observations, which by 
their nature are incomplete and have a finite accuracy and resolution, is one often encoun- 
tered in many branches of physics and astronomy. It has been shown that for a random 
Gaussian field an optimal estimator of the underlying field is given by a minimal variance 
solution (Zaroubi, Hoffman, Fisher and Lahav 1995; ZHFL), known also as the Wiener 
filter (hereafter WF, Wiener 1949, Press et al. 1992). This approach is based on the as- 
sumed knowledge of the second moments of the random field. These moments, also known 
as co-variance matrices, are to be deduced directly from the data, or to be calculated 
from an assumed model, the so-called prior. Within the framework of Gaussian fields the 
WF coincides with the Bayesian posterior axid the maximum entropy estimations (ZHFL). 
Indeed, in the cosmological case where on large enough scales the linear theory applies 
and the (over) density and velocity fields are Gaussian the WF is the optimal tool for the 
reconstruction of the large scale structure. This is further complemented by the algorithm 
of constrained realizations (CRs) of Gaussian fields (Hoffman and Ribak 1991) to create 
Monte Carlo simulations of the residual from this optimal estimation. This combined 
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WF/CR approach has been apphed recently to a variety of cosmological data bases in an 
effort to recover the large scale structure. This includes the analysis of the COBE/DMR 
data (Bunn et al. 1994, Bunn, Hoffman and Silk 1996), the analysis of the velocity potential 
(Ganon and HoflFman 1993), the reconstruction of the density field (Hoffman 1993, 1994, 
Lahav 1993, 1994, Lahav et al. 1994, Webster, Lahav and Fisher 1996) and the peculiar 
velocity field (Fisher et al. 1995a) from the IRAS redshift survey (Fisher et al. 1993). It 
has also been recently applied to the MARKIII peculiar velocities (Willick et al. 1995) to 
reconstruct the underlying dynamical fields (Zaroubi, Hoffman and Dekel 1996). 

A major limitation of the WF/CR approach is that it applies only in the linear regime. 
Yet, on small scales the perturbations are not small and the full non-linear gravitational 
instability theory has to be used. Here the WF/CR method is extended to the quasi- linear 
regime, and a new algorithm of non-linear constrained realizations (NLCRs) is presented. 
The general method of WF and CRs and its modification to the case of quasi-linear data 
is presented in §11. The method is tested against N-body simulations and its application 
to the IRAS 1.2Jy catalog (Fisher et al. 1995b) is given in §111. The NLCRs of our 'local' 
universe are presented in §IV and a short discussion (§V) concludes the paper. 

II. Non Linear Constrained Realizations 

a. Linear Theory 

The general WF/CR method has been fully described in ZHFL and only a very short 
outline of it is presented here. Consider the case of a set of observations performed on 
an underlying random field (with degrees of freedom) s = {si,...,S7v} yielding M 
observables, d = {di, ...,(iM}- Here, only measurements that can be modeled as a linear 
convolution or mapping on the field are considered. The act of observation is represented 

by 

d = Rs + e, (1) 
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where R is a linear operator which represent a point spread function and e = {e^, sm} are 
the statistical errors. Here the notion of a point spread function is extended to include any- 
linear operation that relates the measurements to the underlying field. The WF estimator 
is (ZHFL): 

sW^ = (sdt)(ddt)"'d . (2) 

Here, ^...^ represents an ensemble average and ^sd^^ is the cross-correlation matrix of 
the data and the underlying field. The data auto correlation matrix is 

^dd^) =R^sst)Rt + {|eet^, (3) 

where the second term represents the statistical errors, i.e. shot noise. 

In the case of a random Gaussian field, the WF estimator coincides with the condi- 
tional mean field given the data. A CR of the random residual from this mean field is 
obtained by creating an unconstrained realization of the underlying field (s) and the errors 
(e), and 'observing' it the same way the actual universe is observed. Namely, a mock data 
base is created by: 

d = Rs + e, (4) 
A CR is then simply obtained by (Hoffman and Ribak 1991): 

sCR = s+^sdt^^ddt^"'(d-d). (5) 

b. Covariance Matrices and Shot Noise 

The WF/CR and the NLCR presented here can be used with any data base whose 
relation to the underlying field can be modeled by Eq. 1. Thus for example, observa- 
tions of the velocity field can be used to reconstruct the density field and vice versa. The 
concrete case studied here is the reconstruction of the continuous density field from a dis- 
crete galaxy catalog within a given volume subject to certain selection criteria. Here we 
assume the galaxy distribution in configuration space is known and red-shift distortions 
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are ignored. The formalism of WF/CR can be expressed in any functional representa- 
tion, such as Fourier or spherical harmonics/Bessel functions. The choice of the particular 
representation is usually dictated by the geometry of the observations. In the case of a 
full sky coverage and radial selection function the obvious choice is the spherical harmon- 
ics/Bessel basis (Fisher et al. 1995a, Webster, Lahav and Fisher 1996). However, in the 
case of an incomplete sky survey the so-called zone of avoidance (ZOA) couples the spher- 
ical harmonics and this results in a complicated covariance matrix. For the general case 
we choose here to use the configuration space representation and the field is evaluated on 
a Cartesian grid. The estimation of the continuous field is done by smoothing, i.e. the 
convolution of the discrete galaxy distribution with a certain kernel. The dicreteness of 
the galaxies introduces shot-noise errors and the smoothing procedure cause these errors 
to be correlated to each other. The smoothing kernel depends on the nature of the data 
and is determined by a compromise between two conflicting considerations, namely high 
resolution and low noise level. A high resolution is achieved by using a narrow smoothing 
kernel, and the noise level is reduced by widening the kernel. Here a Gaussian filter is used 
with two smoothing length radii, Rl and Rs- The data is smoothed on a large scale Rl 
and high resolution CRs are created on the scale thus Rl > Rs- 

An estimator of the fractional over-density at the point is given by 

gal 

where n is the mean number density of the galaxies and (p{r) is the data selection function. 
The data autocorrelation function is written as ^Aq,A^^ = ^q;/3 + (^af3- The first term is 
just the autocorrelation function of the smoothed field ( ^*(r) ), 

^ap = e{\ra-rp\) = -^ j P(A;)exp(-(A;i?i,)2)exp(ik-(r,-r^))d3A;, (7) 

and the shot noise covariance matrix is: 

1 [ 1 / (r„-x)='+(rg-x)% _,3 



(The derivation of the error matrix follows Scherrer and Bertschinger 1991.) Note that 
the kernel introduces off-diagonal terms in the error covariance matrix (ZHFL). The cross- 
correlation of the high resolution field and the low resolution data is: 

Uri) = (2^/ PWexp(-^^^^l^±i^)exp(ik. (r, -r«))d^/c (9) 

Defining the WF operator Wip, 

Wip = ^a{ri){^al3 + (Ta(3) \ (10) 

a linear high resolution realization is now obtained by 

5(r,) = 5(ri) + W,^(A^-A/3). (11) 

c. Quasi-Linear Approximation 

In the bottom-up model of gravitational clustering small scales go non-linear before 
large ones. Hence, our basic approach here is to smooth the data on a scale Rl which is 
large enough that the S field, smoothed on that scale, is approximately linear. Given an 
estimator of such a linear observable, the linear formalism of §IIa can be used to make high 
resolution CRs, as if the linear theory is valid on these small scales. However, as it will be 
shown below, numerical N-body simulations show that even for quite high Rl smoothing, 
the resulting field has undergone some non-linear evolution. Thus, the linear procedure of 
the WF/CR has to be supplemented by an additional step of 'linearizing' the input data, 
i.e. mapping the present epoch data back to the linear regime. Various algorithms have 
been proposed to trace back non- linear perturbation field to the linear regime (c/. Strauss 
and Willick 1995). All of these 'time machines' recover the initial linear field in the case 
where the quasi-linear field is known exactly, with no statistical uncertainty. The case of 
real observational data where the shot noise increases with distance, poses a much more 
difficult problem. As the density field is sampled further away it becomes more dominated 
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by the shot noise and in the mean its amphtude increases with distance. Thus, a procedure 
has to be developed that accounts for the statistical noise, separately from the non-linear 
effects. 

The analysis of the various non-linear effects and the test of possible approximations 
to the mapping from the non-linear to the linear regime is best done by an analysis of 
N-body simulations of non-linear gravitational clustering. As our aim here is to apply the 
method to the IRAS 1.2Jy catalog, N-body simulations with the IRAS power spectrum 
have been performed. Mock IRAS data sets are generated from these simulations. (Full 
description of the simulations is given in §111.) The non- linear evolution has been studied 
by smoothing the fully non- linear field, 5^^^ and the linear field, (5^, on the Rl scale. 
An examination of the {5^^, 5^) relation (Fig. 5a) shows both a scatter and systematic 
deviation from the desired 5^ = 5^^. The empirical fix to the 'non-linearity' of the 
smoothed data which is used here consists of two steps. First, to account for the scatter in 
the {5^^ , d^) relation a new term is introduced to the data auto-covariance matrix, a^^. 
Dealing with the scatter by statistical means is a manifestation of our inability to invert 
the exact non-local non-linear mapping from the linear to the quasi-linear regime. Here 
we go to the extreme simplification and take a^^ to be a scalar matrix. The value of this 
constant term is found to be of the order of 10~^ and it is determined by the requirement 

shot noise and a^^. A WF estimator of the i?i:-smoothed field is obtained by applying a 
WF on the data, where Rs is replaced by ii^ to obtain low resolution, 

Aa. (12) 

Rs=Rl 

The estimation of the quasi-linear correction is given by (j^^'Q^ — / (5^^'Q^)) where f{6) 
is a polynomial fitting to the curve shown in Fig. 5a. This correction is evaluated at grid 
points Fq, and is used to correct the data points: 

= Aa - ((JW^'Q^ - / (^WF'QL)) . (13) 
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^WF,QL^r,) = 



The modified ('finearized') A^'s are now substituted in Eq. 11 to obtain a high resolution 
CR of the underlying linear field, given the actual data. 

The ad hoc linearization procedure presented here behaves correctly in the following 
limits. In the case of distant data points, where the data is dominated by shot noise, the 
WF attenuates the estimated field towards zero amplitude. The linearization opperation 
(Eq. 13) would hardly change the amplitude of the WF estimator (Eq.l2). The WF/CR 
is therefore dominated by the random residual, and consequently the resulting realization 
lies in the linear regime. For nearby data points where the shot noise is negligible, the WF 
leaves the signal almost untouched 5^^'^^ A. The procedure described here correctly 
recovers the Gaussian 1-point distribution function of the density field, however it will not 
correctly reconstruct the two point function. The procedure suggested here works only for 
data close to the linear regime. 

The mock data base of Eq. 4 that is used to sample the residual from the mean, has 
to be subjected to the same non-linear effects as the real data do. Thus, the mock data Aq, 
have to be sampled from a mock IRAS catalog drawn out of an N-body simulation, and 
then to be mapped to the linear regime in the same way the actual data are treated. The 
linear CRs thus generated are used to set the initial conditions for N-body simulations. 

III. Testing Against N-body Simulations 

Within the linear regime the WF/CR method provides a rigorous way of estimating 
the underlying field and making realizations that are consistent both with the data and the 
prior model. However, beyond the linear regime one should use various approximations to 
achieve such a goal. In particular, the mapping of the quasi-linear data to the linear regime 
has to be calibrated by testing it against N-body simulations. This empirical approach 
strongly depends on the nature of the data and the assumed model, and therefore it should 
be tested on mock data sets that are drawn from the full N-body systems in a manner that 
mimics the actual observations. It follows that unlike the linear regime reconstruction, 
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where a general WF/CR formalism can be formulated, in the quasi-linear domain the 
approach should be fitted to the problem at hand. 

Our aim is to apply the present method to the IRAS 1.2 Jy redshift survey. The 
prior model assumed here is a flat, = 1, CDM-like model with a shape parameter of 
r = 0.2, normalized by us = 0.7 and with no biasing. Four random linear realizations 
of this model were generated and used as initial conditions to a 128^ PM N-body code 
(written by E. Bertschinger) with a comoving box size of 1.6 x 10^ km/s, periodic boundary 
conditions and grid spacing of 125 km/ sec. The low resolution is defined by a Gaussian 
filter of smoothing length Rl = 1000 km/s (here distances are given in units of /cm/s)and 
the high resolution is limited by the Nyquist wavelength. The smoothed field is sampled 
within a sphere of radius 6000 km/s on a Cartesian grid at a sampling rate of Rl, yielding 
925 constraints. The smoothing is done by an FFT convolution on a count-in cells (CIC) 
of the particles done on the basic grid. In the numerical experiments no Galactic ZOA is 
assumed. 

A non-trivial task here is the calculation of the non-diagonal error correlation matrix. 
The symmetry of the problem leaves only three independent degrees of freedom out of the 
six (ri,r2) variables. The resulting integral is taken over a product of a Gaussian that 
peaks at ~ (ri + r2)/2 and the inverse of the selection function, which makes it a three- 
dimensional integral. The integral is calculated numerically on a finite cube of size 6Rl 
centered at ~ (ri + r2)/2. Special care has been taken to control the numerical errors, as 
the inverse of the auto-correlation matrix is found to be sensitive to the numerical noise. 
The matrix inversion is done by a Cholesky decomposition algorithm which is fast and 
stable (Press et al. 1992). 

The numerical simulations provide an ensemble of four random non-linear realizations 
of the prior model. The first step in analyzing the N-body simulation is the construction of 
the 'galaxy' distribution out of the 'dark matter' particles. No biasing is assumed here and 
a fraction of the particles are randomly tagged as galaxies, so at to reproduce the IRAS 
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mean number of galaxies. Here a mean number density of Ug = 5.48 x 10~^{km/s)~^ 
(Fisher 1992, Fisher et al. 1994) is used. Given a 'volume limited' mock IRAS catalog, 
a selection process is applied to generate a magnitude limited mock IRAS 1.2Jy catalog, 
based on the selection function: 

with Ts = 500 /cm/s, a = 0.492, (3 = 1.830, r* = blMkm/s (Yahil et al. 1991, Fisher 1992). 




Figure 1: Random fields: The linear fiuctuation density fields of the four random 
realizations given the prior as described in the text, smoothed with a lOOOfcm/s 
Gaussian window. The Supergalactic plane is shown, the units are in lOO/cm/s 
and the contour spacing is 0.2 in 8. The heavy contour belongs to 5 = while solid 
and dot contours represent overdensity and underdensity regions respectively. The 
circle of radius 6000 A;m/s marks the region from which constraints are taken. 
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From the four different realizations one is treated as the 'real' data and other three are 
used to make the NLCRs. These are labeled by 'R' and 1, 2 and 3 in the various figures. 
The four different linear random realizations are shown in Fig. 1. (In all figures the 
continuous 5 field is shown at the low resolution of i?L = 1000 km/ s with contour spacing 
of 0.2 in S). The 'R' non- linear realization has been sampled to produce a mock IRAS 
data base, from which the CRs are generated. The three linear CRs, which serve as initial 
conditions to the PM code, are shown in Fig. 2 where the R-realization is given as well. 
The success of the whole procedure is actually determined at this stage. Namely how close 
are the various 1, 2, 3 fields to the 'R' field. The general 'cosmography' is nicely recovered 
by the different CRs, with a little scatter. The smoothed density of the fully evolved 
NLCRs and the original 'R' field are shown in Fig. 3. Fig. 4 shows the 'R' volume limited 
non-linear 'galaxy' distribution, which is to be recovered by the three NLCRs. The bottom 
panel shows the magnitude limited 'galaxy' distribution, using the IRAS selection function, 
from which the constraints are drawn. Various objects that dominate the 'cosmography' 
of the simulations are tagged by capital letters A-F in Figs. 2, 3 & 4. The circle of radius 
6000 km/ s marks the region from which constraints are drawn. 

The quality of the reconstruction provided by the NLCRs can be judged from Figs. 
3 & 4. The main objects that define the cosmography of the 'R' simulation, within the 
sampling radius of ^ 6000A;m/s , are the A, B, D & E clusters and the extended void 
that occupies most of the sampling volume (C). The F denotes a filament running from 
A towards the object at SGX,SGY = (0,8000), which is clearly seen in the particle 
distribution but is hardly noticed in the smoothed field. At the level of the smoothed 
field, the NLCRs (1-3) recovers the R realization to within one or two contour lines. No 
systematic discrepancy between the recovered fields and the original one is found here. 
The comparison at the particle distribution level constitutes a much more challenging test 
to our method. The robust peak at A breaks into sub-clumps that are located at the 
intersection of a few filaments. The actual sub-clumps position and distribution vary in 
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Figure 2: Linear CRs from non-linear data and the linear 'R' field: The linear 
CRs (1, 2, 3) are constrained by data sampled from the fully evolved 'R' field. 
These are supposed to recover the linear 'R' field, which is presented here. The 
various objects that define the gross structure are marked by the letters A, B, 
C, D, E & F, and these are introduced for the sake of the comparison of the 
reconstructed fields with the original. The parameters of the plot are the same as 
in fig. 1. 

the NLCRs, but the overall statistical nature of the objects is recovered. As expected, the 
reconstruction of the filaments is less robust than the density peaks. The F filament that 
starts at one side of the A cluster and continues at its other side is reproduced by all the 
NLCRs. This is also the case with the filament that runs from cluster B towards the origin. 
However, there is quite a large scatter in the overall structure of the network of filaments. 
The conclusion that follows is that the NLCRs provide a faithful reconstruction of the 
non-linear density field at the 1000 km/s Gaussian smoothing level. The highly non-linear 
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small scale structure is formed within the correct 'boundary conditions' provided by the 
'actual universe' {i.e. the 'R' realization), but with some degree of random variability as 
dictated by the unconstrained random realizations used. 




SGX SGX 



Figure 3: NLCRs and the non-linear 'R' field: The smoothed non-linear evolved 
density fields of the real universe, 'R', and the ensemble of NLCRs(l,2 & 3). The 
NLCRs reconstruct the 'R' field, within the limitations of the method. 

The simulations are used to calibrate the non- linear correction of Eq.l3. A scatter 
plot of the {6^, 6^^) is shown in Fig. 5a (left panel), where both the bias and scatter are 
clearly seen. The non-linear mapping is applied to 5^^ to produce a corrected linear 5^^. 
Indeed, the scatter plot of (5^,5^^) (Fig. 5a, right panel) shows that the bias is removed 
without increasing the scatter (see also Nusser et al. 1991). The mapping recovers the 
normal 1-point distribution function (Fig. 5b) and the theoretical power spectrum (Fig. 
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Figure 4: Volume limited mock 'galaxy' catalogs: Slices of thickness of 2000 km/s 
of the volume limited 'galaxy' distribution normalized to the mean galaxy density 
of the IRAS 1.2Jy survey. The top left panel shows the 'galaxy' distribution 
of the 'R' field. Applying the IRAS selection function produces the magnitude 
limited sample that is shown in the bottom left panel. The mock survey is used 
to set constraints on the random realizations of the underlying field. The 'galaxy' 
distributions of the three NLCRs are shown here, and these are to be compared 
with the 'R' field. 

5c). Here only the dynamical aspects are studied, and no 'observational' uncertainties are 
considered. 

The mock realization has been used to simulate the effect of the improvement of the 
sampling on the quality of the reconstruction. The 'R' simulation has been re-sampled by 
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Figure 5: (a) Left: Scatter plot of linear vs. non-linear evolved density field 
smoothed with a lOOOfcm/s Gaussian filter. Right: The same plot after the ap- 
plication of a polynomial correction on the non-linear field, (b) The 1-point prob- 
ability distribution function of 6 (histogram) as calculated from the linear CR 
given corrected linear constraints. The solid line represents the theoretical Gaus- 
sian PDF. (c) The power spectrum of the linear CR (points). The dotted curve 
is the CDM-like power spectrum that is used as a prior for the reconstructions. 



the PSCZ selection function, which corresponds to an IRAS survey sampled down to a 0.6Jy 
fiux at 60//. (The PSCZ survey is an extension of the 0.6Jy QDOT l-in-6 survey, Saunders 
et al. 1990 , which is now being completed to all galaxies brighter than the above limit. For 
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details see http: //www-astro .physics . ox . ac.uk/ ^wjs/pscz .html.) The improvement 
in the errors is shown in Fig. 6 as they are normahzed to the rms value of the underlying 
field. Here, only one NLCR has been reconstructed, and the No.l realization is used. Fig. 
7 shows the smoothed 'R' realizations, to be reconstructed, the linear CR that serves as 
the initial conditions, the smoothed density field and particle distribution of the NLCR. A 
comparison of Fig. 7 with the Fig. 3 & 4 shows the improvement gained by the reduction of 
the shot-noise. This is clearly seen in the smoothed non-linear field, where the PSCZ-based 
reconstruction recovers the original field much better than the 1.2 Jy-based field. 



b 



b 
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Figure 6: The diagonal elements of the error correlation matrix, normalized to 
the rms value of the 1000 /cm/s smoothed field vs. the distance. The two curves 
show the variation of the errors in the case of the IRAS 1.2Jy and PSCZ selection 
functions. 



IV. Volume Limited IRAS Catalog 

The IRAS 1.2Jy catalog consists of 5321 galaxies. These are used to evaluate the 
smoothed density field on a Cartesian grid of 1000 /cm/s spacing within a 6000 /cm/s, 
excluding the ZOA, yielding 846 constraints. The NLCRs are created on a finer 128'^ grid 
of 125 /cm/s spacing, assuming periodic boundary conditions. However, for a CDM-like 
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Figure 7: PSCZ reconstruction of tlie mock catalog: Upper left: The non-linearly 
evolved 'R' field. Upper right: Linear CR from a PSCZlike survey of the 'R' field. 
The constraints are applied to the random realization # 1 of the ensemble of 
realizations. Lower left: The smoothed NLCR field for the PSCZ case. Lower 
right: Slice of thickness 2000 km/ s centered at the Supergalactic plane of the 
reconstructed volume limited PSCZ-like survey. 

power spectrum the structure within the 6000 km /s is hardly affected by the periodicity 
on the ±8000 km /s box. A word of caution should precede the analysis of the NLCRs. 
The neglect of the redshift distortions affects the reconstructions in two ways. One is the 
displacements of the objects by a few xlOO km/s, and the other is the amplification of the 
(over)density amplitude, due to the gravitational focusing (c/. Kaiser 1986). 

The actual IRAS survey is affected by Galactic ZOA of \b\ < 5°. Our choice of a 
Cartesian three dimensional representation is especially suited for handling zones of missing 
data, where the only correction to be applied is for the truncated Gaussian smoothing 
near the ZOA. Otherwise, no assumption is made on the completeness of the survey sky 
coverage. This is to be contrasted with the case of orthogonal representations such as 
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Figure 8: Linear CRs of tlie IRAS 1.2Jy survey: Tlie 'R' plot sliows tlie Su- 
pergalactic plane of the lOOOfcm/s smoothed raw IRAS 1.2Jy survey. The rest 
three plots show the linear CRs which are reconstructed by imposing constraints 
from the smoothed actual IRAS survey. The three linear CRs are generated from 
the three random realizations (1,2, & 3) used for the reconstruction of the mock 
catalog. 

Fourier or spherical harmonics/Bessel functions, where the treatment of regions of missing 
data usually results in a very complicated mode-mode coupling (Fisher et al. 1995a) The 
Wiener filter extrapolates the density field within the ZOA, using the galaxy distribution 
at the two sides of the ZOA. 

An 'ensemble' (namely, three) of NLCRs has been generated, from which one can 
estimate the variance of the different reconstructions and thereby asses the quality of 
the reconstruction. The 'observed' smoothed IRAS 1.2Jy density field is presented in 
Fig. 8 (panel labeled by R), and the three linear CRs are shown as well, all presented 
on the Supergalactic plane. Note that in the linear CRs the amplitude of the positive 
(over) density field never exceeds the observed one. The voids, on the other hand, become 
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Figure 9: NLCRs of IRAS 1.2Jy: The ensemble of the three NLCRs based on 
the IRAS 1.2Jy survey smoothed by a 1000 km/ s Gaussian window. 

more empty in the CRs because of the non-hnear correction. The three NLCRs of the 
actual universe are shown in Fig. 9. The random realizations used here are the same 
ones used for the reconstruction of the mock data (Fig. 1). The fundamental features of 
the Supergalactic plane are the Great Attractor (GA), the Perseus part of the Perseus- 
Pisces (PP) supercluster, and the Local Void (LV). The overdensity containing the Coma 
cluster at SGX, SGY ^ (0, 8000) km/s is clearly seen. However, it lies at the edge of the 
simulation and is affected by the periodic boundary conditions and therefore its physical 
characteristics cannot be studied here. The 'volume limited' galaxy distribution of the 
NLCRs as well as the IRAS 1.2Jy survey itself are shown in Fig. 10. The structure of 
peaks and the network of filaments at the GA region, on the one side of the Local Group 
and at the PP region on the other side, is a robust feature of all the NLCRs. The local 
void is recovered in all the NLCRs, but the details of the filamentary network within the 
voids vary for one realization to the other. It is in this sense that the term of 'non-linear 
reconstruction' of the local LSS can be used, namely the gross features of the LSS are 
imprinted onto the otherwise random non-linear realizations. 

To further study the cosmography recovered by the NLCRs, Aitoff projections of the 
'galaxy' distribution are plotted. Here only one NLCR (No.l) is analyzed in that way. 
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Figure 10: Volume limited IRAS 1.2Jy reconstructions: The 'R' plot shows a 
2000 km/s thick slice of the IRAS 1.2Jy redshift survey centered at (SGZ=0). The 
other plots show the ensemble of the reconstructed volume limited IRAS 1.2Jy 
surveys. 

Aitoff projections of the 'galaxy distribution' within shells of thickness of 1000 km/s at 
distances of 2000, 3000 and 4000 km/s (Fig. 11a) and 5000, 6000 and 7000 km/s (Fig. lib) 
are plotted. Contours of the smoothed field are superimposed on the 'galaxy' distribution. 
Variable smoothing is used here to compensate for the variation of the angular particle 
density with depth. The smoothing kernel radii used here are 500, 640, 780, 920, 1060, 
1200fcm/s, corresponding to 5rms = 0.80,0.67,0.56,0.46,0.41,0.37. In all projections a 
contour spacing of 0.2 is used. The identification of the various objects grossly follows the 
cosmographical analysis of Webster, Lahav and Fisher (1996), who used a linear Wiener 
filter in the spherical harmonics/Bessel representation to analyze the same data base used 
here. 

The nearby structure, at 2000 A;m/s, is dominated by the foregrounds of Centaurus at 
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Figure 11a: AitofF projections in galactic coordinates of the NLCR (7^ 1) recon- 
struction of the volume limited IRAS 1.2Jy: From top to bottom the 2000,3000 
& 4000 km/s shells. The thick curve marks the Supergalactic plane and the 
particles are within a 1000 A;m/s thick slice centered on these distances. 

(/, h) ^ (305, 30)°. The 2000 km/s plot shows also the Ursa Major (/, b) ^ (170, 25)° and 
the outskirts of the Fornax-Doradus-Eridanus complex {l,b) (180,-60)°. Underdense 
regions are identified at the two sides of the Supergalactic plane. These include the Local 
Void (330° < I < 120°, -70° < b < 60°), a void at (120° < I < 200°, -40° < 6 < 0°) 
and, a void at (210° < I < 330°, —85° < 6 < 0°), which extends over all the slices out to 
7000 km/s. 
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The structure at 3000 km/s is dominated by the GA complex which hes close to the 
Supergalactic plane and is almost aligned along the line of sight towards {I, h) pa (300, 25)°. 
The Pavo-Indus-Telescopium (PIT) complex {I, h) (330, —15)° forms an extension of the 
GA. On the other side of the Supergalactic plane the structure is dominated by two ridges, 
one in the direction of {l,h) (150,25)° connecting Ursa Major (at 2000fcm/s) with 
Camelopardalis (at 4000 /cm/s) and the other that extends all the way towards the PP at 
(Z, h) ^ (165, —15)°, which peaks at a distance of ^ 6000 km/s. 

At 4000 km/s Ccntaurus/GA ( (/, 6) ^ (315,20)°) and PIT ( (/,6) (325,-15)°) are 
the dominant overdensities, and these are connected by a filamentary structure running 
close to the Supergalactic plane at I ~ 320°, —10° <b < 10°. Other features in the plot are 
a concentration at (/, b) ~ (180, —25)° and the Camelopardalis cluster (/, b) ~ (150, 15)°. 

Going further to 5000 A;m/s the plot shows the Centaurus/GA {l,b) (320,10)°, 
a continuation of the {l,b) ^ (180,-25)° structure, the PP at {l,b) ^ (150,-15)°, a 
prominent overdensity at {I, b) ~ (275, 10)° which is quite separated from Centaurus, 
PIT (Z, b) (345, —20)° continues from the previous slice, and an overdensity at (Z, b) 
(130, —65)° which is extending out to 1000km/ s. It is not clear whether the remarkable 
concentration at (Z, b) ~ (40, —15)°, which seems to extend from 4000 km/s to 7000 km/s 
is a real object or a fluke induced by the shot noise. 

The IRAS density field is sampled within a distance of 6000 A;m/s and beyond it the 
random nature of the NLCR becomes more dominant. To study the reconstructed structure 
at these distances a more detailed comparison of the members of the 'ensemble' of NLCRs 
is needed, so that the robust features of the realizations can be separated from the random 
ones. Such an effort is beyond the scope of the present work. Here only the main objects 
of the 6000 A;m/s slice are noted. The dominant feature at this shell is the PP at (Z,6) ^ 
(165,-15)°. Also shown in this plot are the 'tail' of Centaurus/GA (/,6) ^ (320,10)°, 
Leo (Z, b) fti (270, 50)°, Cancer (/, b) ^ (190, 35)° and PIT {I, b) ^ (350, -15)°. The strong 
overdensity at {l,b) (40, —20)° extends from the 5000 A;m/s slice to the 7000 A;m/s one. 
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Figure lib: Aitoff projections in galactic coordinates of the NLCR (# 1) recon- 
struction of the volume limited IRAS 1.2Jy: From top to bottom the 5000, 6000 
& 7000 km/s shells. The thick curve marks the Supergalactic plane and the 
particles are within a 1000 /cm/s thick slice centered on these distances. 



The almost complete sky coverage of the IRAS survey, of a ZOA of \h\ < 5°, and 
the method used here enable a good restoration of the obscured structure. Indeed the 
density field found here agrees with optical studies of the ZOA, in particular the optical 
survey of the southern ZOA ( (265 — 340, -10 — 10)°) of Kraan-Korteweg and 
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her colleagues {eg. Kraan-Korteweg, Woudt and Henning 1996). 

V. Discussion 

The NLCR algorithm presented here enables one to perform controlled Monte Carlo 
N-body simulations of the formation of our 'local' universe. These are designed to recover 
the actual observational LSS within the statistical uncertainties of the data. The new 
ingredient introduced here is the reconstruction of the non-linear regime, i.e. the extrap- 
olation in Fourier space from small to large wavenumbers that are deep in the non-linear 
regime. 

The NLCR introduced here can serve as a tool for studying and analyzing the large 
scale structure of the universe. Some of the obvious problems where NLCRs are expected 
to be very useful are: (1) The reconstruction of the velocity field from redshift catalogs; 
(2) Mapping the zone of avoidance and extrapolating the dynamical fields into unobserved 
regions; (3) Studying the dynamics of observed rich clusters with the actual initial and 
boundary conditions; (4) Analysis of filaments and pancakes as probes of the initial con- 
ditions and the cosmological model; (5) The NLCR can serve as a probe of the biasing 
mechanism. The main virtue here lies in the fact that different data sets, which in principal 
can represent different biasing of the underlying dynamical field, can be used to simulta- 
neously set constraints on the realizations. Given all these and the technical simplicity of 
the algorithm we expect it to be a standard tool of N-body and gas dynamical simulations. 

The algorithm used here relies on using a Cartesian spatial representation of the 
density field. This should be considered as an attractive alternative to the spherical har- 
monics/Bessel functions representation (Fisher et al. 1995a). The later provides a rep- 
resentation that is well suited to handle full sky redshift surveys, including an elegant 
treatment of redshift distortions. However, this method has two main shortcomings. One 
is the complicated mode-mode coupling that is introduced by an incomplete sky coverage. 
The other is the effective variable smoothing of the reconstruction, which is not suitable 
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for setting initial conditions for N-body simulations. These two problems are naturally- 
solved by using the Cartesian representation, where no a priori geometry of the survey is 
assumed, and an arbitrary sky coverage of various data sets can be handled. The represen- 
tation used here provides CRs that have a constant spatial resolution, which makes it the 
optimal tool for setting initial conditions for numerical simulations. Indeed, the amplitude 
of the mean (WF) field decreases as the shot-noise errors increase with distance, but this 
is compensated by the random component of the CRs. The resulting realizations have a 
constant resolution with a constant power. The method outlined here can be extended to 
handle very large data sets by using data compression methods, in particular the signal- 
to-noise eigenmode expansion (Zaroubi 1995, Vogeley and Szalay 1996,Tegmark, Taylor, 
& Heavens 1996). Using this representation the dimensionality of the data space can be 
significantly reduced, and the WF / CR algorithm can be easily formulated in the reduced 
data space. A shortcoming of the present work is the neglect of redshift distortions. How- 
ever, these can be handled by using correlation matrices of Eq. 2, properly expressed 
in terms of redshift space dynamical variables. The general formalism of evaluating the 
redshift space variables, within the linear theory, was outlined by Zaroubi and Hoffman 
(1996). 

At the time this paper had been originally submitted Kolatt et al. (1996) reported 
on a similar project of NLCR of the IRAS 1.2Jy catalog. Their procedure differs from 
the present one mainly in not distinguishing between the low resolution (data) and high 
resolution (realizations). The input data is smoothed on the 500 km/s scale and is heavily 
dominated by the noise, which is 'removed' by a power preserving filter (PPF) which 
is a modified WF. The PPF is designed to preserve the power, regardless of the noise 
level. The resulting estimator is therefore more dominated by the noise and less by the 
prior model compared to our method. Next, the resulting 500 km/s PPFed field is taken 
from the quasi- to the linear regime by the Nusser and Dekel (1992) 'time machine', 
where a further Gaussianization is applied to the linear field. Small scale structure is 
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then added to the hnearized and Gaussianized 500 A;m/s-smoothed field by constrained 
reahzations. In comparison to the Kolatt et al. the present algorithm is more rigorous and 
consistent with the theoretical framework. Our procedure involves only one ad hoc step, 
namely the linearization procedure, where Kolatt et al. used the PPF, linearization and 
Gaussianization to obtain the 500 A;m/s-smoothed field. Yet, Kolatt et al. used a better 
and more consistent linearization procedure, which can replace the one used here. In spite 
of the very different approaches it seems that both methods yield similar results and are 
equally efficient. Detailed comparisons against N-body simulations and mock catalogs are 
needed to judge the merits of each method. 
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